-
Notifications
You must be signed in to change notification settings - Fork 3.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add support for fetching Redshift query results using Redshift unload command #24117
Add support for fetching Redshift query results using Redshift unload command #24117
Conversation
plugin/trino-redshift/src/main/java/io/trino/plugin/redshift/RedshiftUnloadSplitSource.java
Outdated
Show resolved
Hide resolved
plugin/trino-redshift/src/main/java/io/trino/plugin/redshift/RedshiftUnloadSplitSource.java
Outdated
Show resolved
Hide resolved
plugin/trino-redshift/src/main/java/io/trino/plugin/redshift/RedshiftUnloadSplitManager.java
Outdated
Show resolved
Hide resolved
plugin/trino-redshift/src/main/java/io/trino/plugin/redshift/RedshiftUnloadSplitManager.java
Outdated
Show resolved
Hide resolved
plugin/trino-redshift/src/main/java/io/trino/plugin/redshift/RedshiftUnloadSplitManager.java
Outdated
Show resolved
Hide resolved
51fae23
to
4732ea6
Compare
f5be0a8
to
7b7d095
Compare
plugin/trino-redshift/src/main/java/io/trino/plugin/redshift/RedshiftPageSource.java
Outdated
Show resolved
Hide resolved
plugin/trino-redshift/src/main/java/io/trino/plugin/redshift/RedshiftSessionProperties.java
Outdated
Show resolved
Hide resolved
plugin/trino-redshift/src/test/java/io/trino/plugin/redshift/TestRedshiftUnload.java
Outdated
Show resolved
Hide resolved
plugin/trino-redshift/src/test/java/io/trino/plugin/redshift/TestRedshiftUnload.java
Outdated
Show resolved
Hide resolved
plugin/trino-redshift/src/main/java/io/trino/plugin/redshift/RedshiftUnloadSplitManager.java
Outdated
Show resolved
Hide resolved
plugin/trino-redshift/src/main/java/io/trino/plugin/redshift/RedshiftUnloadSplitManager.java
Outdated
Show resolved
Hide resolved
plugin/trino-redshift/src/main/java/io/trino/plugin/redshift/RedshiftUnloadSplitManager.java
Outdated
Show resolved
Hide resolved
plugin/trino-redshift/src/main/java/io/trino/plugin/redshift/RedshiftUnloadSplitManager.java
Outdated
Show resolved
Hide resolved
plugin/trino-redshift/src/main/java/io/trino/plugin/redshift/RedshiftUnloadSplitManager.java
Outdated
Show resolved
Hide resolved
plugin/trino-redshift/src/main/java/io/trino/plugin/redshift/RedshiftConnectorFactory.java
Show resolved
Hide resolved
plugin/trino-redshift/src/main/java/io/trino/plugin/redshift/RedshiftPageSource.java
Outdated
Show resolved
Hide resolved
plugin/trino-redshift/src/main/java/io/trino/plugin/redshift/RedshiftPageSource.java
Outdated
Show resolved
Hide resolved
plugin/trino-redshift/src/main/java/io/trino/plugin/redshift/RedshiftPageSource.java
Outdated
Show resolved
Hide resolved
plugin/trino-redshift/src/main/java/io/trino/plugin/redshift/RedshiftUnloadSplitManager.java
Outdated
Show resolved
Hide resolved
plugin/trino-redshift/src/main/java/io/trino/plugin/redshift/RedshiftUnloadSplitManager.java
Outdated
Show resolved
Hide resolved
plugin/trino-redshift/src/main/java/io/trino/plugin/redshift/RedshiftUnloadConnector.java
Show resolved
Hide resolved
plugin/trino-redshift/src/main/java/io/trino/plugin/redshift/RedshiftPageSource.java
Outdated
Show resolved
Hide resolved
plugin/trino-redshift/src/main/java/io/trino/plugin/redshift/RedshiftPageSourceProvider.java
Outdated
Show resolved
Hide resolved
plugin/trino-redshift/src/main/java/io/trino/plugin/redshift/RedshiftUnloadSplitManager.java
Outdated
Show resolved
Hide resolved
a4868fc
to
a52f29b
Compare
4be8ea2
to
e33c9a4
Compare
https://github.com/trinodb/trino/actions/runs/12462869117/job/34784444202?pr=24359 Redshift tests are passing. |
core/trino-main/src/main/java/io/trino/operator/SplitOperatorInfo.java
Outdated
Show resolved
Hide resolved
plugin/trino-redshift/src/main/java/io/trino/plugin/redshift/RedshiftClientModule.java
Outdated
Show resolved
Hide resolved
plugin/trino-redshift/src/main/java/io/trino/plugin/redshift/RedshiftUnloadSplitSource.java
Outdated
Show resolved
Hide resolved
plugin/trino-redshift/src/test/java/io/trino/plugin/redshift/TestRedshiftConnectorTest.java
Outdated
Show resolved
Hide resolved
plugin/trino-redshift/src/test/java/io/trino/plugin/redshift/TestRedshiftUnload.java
Outdated
Show resolved
Hide resolved
plugin/trino-redshift/src/test/java/io/trino/plugin/redshift/TestRedshiftUnload.java
Outdated
Show resolved
Hide resolved
plugin/trino-redshift/src/main/java/io/trino/plugin/redshift/RedshiftParquetPageSource.java
Outdated
Show resolved
Hide resolved
@@ -64,6 +64,32 @@ documentation](https://docs.aws.amazon.com/redshift/latest/mgmt/jdbc20-configura | |||
```{include} jdbc-authentication.fragment | |||
``` | |||
|
|||
### UNLOAD configuration | |||
|
|||
This feature enables using Amazon S3 to efficiently transfer data out of Redshift |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please describe what this actually does for user .. seems like it is only a performance improvement .. right? And if so .. why is it called unload?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This feature uses UNLOAD(https://docs.aws.amazon.com/redshift/latest/dg/r_UNLOAD.html) approach for performance improvements. Additionally, it requires additional configs to enable this feature. Please suggest text improvements.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So lets move that into the performance section and explain more there.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I assume we need to explain that they need a S3 account and whatever else .. and when this should and should not be used.. I am not aware of this info so I cant really reword appropriately without more details
plugin/trino-redshift/src/main/java/io/trino/plugin/redshift/RedshiftUnloadSplitSource.java
Outdated
Show resolved
Hide resolved
plugin/trino-redshift/src/main/java/io/trino/plugin/redshift/RedshiftUnloadSplitSource.java
Outdated
Show resolved
Hide resolved
plugin/trino-redshift/src/main/java/io/trino/plugin/redshift/RedshiftUnloadSplitSource.java
Show resolved
Hide resolved
plugin/trino-redshift/src/main/java/io/trino/plugin/redshift/RedshiftUnloadSplitSource.java
Outdated
Show resolved
Hide resolved
plugin/trino-redshift/src/main/java/io/trino/plugin/redshift/RedshiftUnloadSplitSource.java
Outdated
Show resolved
Hide resolved
plugin/trino-redshift/src/test/java/io/trino/plugin/redshift/TestRedshiftUnload.java
Show resolved
Hide resolved
plugin/trino-redshift/src/main/java/io/trino/plugin/redshift/RedshiftParquetPageSource.java
Outdated
Show resolved
Hide resolved
plugin/trino-redshift/src/main/java/io/trino/plugin/redshift/RedshiftUnloadSplitSource.java
Outdated
Show resolved
Hide resolved
c9a1b30
to
d35c9b2
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lgtm % comments
core/trino-main/src/main/java/io/trino/operator/SplitOperatorInfo.java
Outdated
Show resolved
Hide resolved
plugin/trino-redshift/src/test/java/io/trino/plugin/redshift/TestRedshiftUnload.java
Outdated
Show resolved
Hide resolved
d35c9b2
to
93308f8
Compare
plugin/trino-redshift/src/main/java/io/trino/plugin/redshift/RedshiftUnloadSplitSource.java
Outdated
Show resolved
Hide resolved
plugin/trino-redshift/src/main/java/io/trino/plugin/redshift/RedshiftSplitManager.java
Show resolved
Hide resolved
plugin/trino-redshift/src/main/java/io/trino/plugin/redshift/RedshiftUnloadSplitSource.java
Show resolved
Hide resolved
93308f8
to
c27227d
Compare
Previously SplitOperatorInfo wasn't Mergeable and hence base OperatorInfo(OperatorStats#getMergeableInfoOrNull) was null.
c27227d
to
9582983
Compare
...no-redshift/src/main/java/io/trino/plugin/redshift/RedshiftUnloadJdbcQueryEventListener.java
Outdated
Show resolved
Hide resolved
9582983
to
613cada
Compare
Co-authored-by: Mayank Vadariya <[email protected]>
d0a5473
to
f8fdfb1
Compare
Description
Redshift supports unloading select query results to S3 using UNLOAD command. PR aims to use unload command to unload Redshift query results on S3 and to read generated redshift query results files in parallel.
Official documentation on Redshift unload command:
https://docs.aws.amazon.com/redshift/latest/dg/r_UNLOAD.html
https://docs.aws.amazon.com/redshift/latest/dg/r_UNLOAD_command_examples.html
Connector internally converts Trino SQL to Redshift's unload SQL and runs it. Resultant Parquet files containing query results will be read in parallel to generate Trino query result. Only Parquet file format is supported in Redshift unload command.
Below are certain limitations for which Trino internally force fallback to traditional JDBC even if connector is configured to use unload.
TimeType
andVarbinaryType
column type is unsupported as unload doesn't support unloading to Parquet format.DecimalType
column is unsupported with unload as unload query is generated by Trino doesn't add precision tonumeric
cast function.Benchmarking
Setup information:
Coordinator size:
m6g.2xlarge
Worker size: 8 X
m7g.8xlarge
Redshift cluster size: 2(nodes) X
dc2.large
Benchmarking stats
UNLOAD performs better as the volume of query results increases .
I'd like to thank @raunaqmorarka for his constant support in helping me land this PR.
Additional context and related issues
Release notes
( ) This is not user-visible or is docs only, and no release notes are required.
( ) Release notes are required. Please propose a release note for me.
(x) Release notes are required, with the following suggested text: